Skip to content

oci: extract artifact-agnostic primitives into oci/artifact (Phase 0, THV-0077)#133

Open
JAORMX wants to merge 1 commit into
mainfrom
feat/oci-artifact-shared-primitives
Open

oci: extract artifact-agnostic primitives into oci/artifact (Phase 0, THV-0077)#133
JAORMX wants to merge 1 commit into
mainfrom
feat/oci-artifact-shared-primitives

Conversation

@JAORMX

@JAORMX JAORMX commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Phase 0 of plugin support (RFC THV-0077)

Extracts the artifact-agnostic OCI primitives out of oci/skills into a new oci/artifact package so the upcoming oci/plugins package can reuse them. Behavior-preserving refactor — no logic changes.

What moved to oci/artifact

  • tarCreateTar/ExtractTar/ExtractTarWithLimit, FileEntry, TarOptions/DefaultTarOptions, MaxTarFileSize, path-traversal/symlink rejection
  • gzipCompress/Decompress/DecompressWithLimit, CompressTar/DecompressTar, GzipOptions/DefaultGzipOptions, MaxDecompressedSize (decompression-bomb guard)
  • platformPlatformString/ParsePlatform/DefaultPlatforms + OS/Arch consts
  • pull-hardening — promoted to exported ValidatingTarget/NewValidatingTarget (+ manifest/blob size caps and manifest-count limits, digest verification)

Backward compatibility

oci/skills re-exports every moved symbol via type aliases (type X = artifact.X), var-forwarding for funcs, and const re-declaration (oci/skills/artifact_aliases.go). The public surface of oci/skills is unchanged — no exported signature changed, and downstream consumers (toolhive) are unaffected. The skills determinism tests still assert byte-stable artifact digests, proving published skill digests don't change.

Notes

  • Skill-specific code (media types, labels, SkillConfig, Packager, Registry, Store) stays in oci/skills.
  • New package is Alpha (per CLAUDE.md stability tracks).
  • This phase precedes GATE-C1 (tag toolhive-core + bump toolhive's go.mod), which is tracked in the epic.

Closes #130
Part of stacklok/toolhive#5525
RFC: stacklok/toolhive-rfcs#77

🤖 Generated with Claude Code

Phase 0 of plugin support (THV-0077). Move the artifact-agnostic OCI
primitives out of oci/skills into a new oci/artifact package so a future
oci/plugins can reuse them:

- tar (CreateTar/ExtractTar*/FileEntry/TarOptions, MaxTarFileSize)
- gzip (Compress/Decompress*/CompressTar/DecompressTar, MaxDecompressedSize)
- platform helpers (PlatformString/ParsePlatform/DefaultPlatforms, OS/Arch)
- pull-hardening, exported as ValidatingTarget/NewValidatingTarget plus the
  manifest/blob size caps and manifest-count limits

oci/skills re-exports every moved symbol via type aliases, var-forwarding,
and const re-declaration (oci/skills/artifact_aliases.go), so the package's
public surface is unchanged and downstream consumers (toolhive) are
unaffected — no exported signature changed. Behavior-preserving move:
function bodies are identical and the oci/skills determinism tests still
assert byte-stable artifact digests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@jhrozek jhrozek left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Batched review from /review-iterate on the oci/artifact extraction. These are non-blocking (the refactor itself is sound — build/vet/test/lint/license all pass and the security posture is preserved). Findings cluster on the back-compat alias layer for an Alpha package, plus two pre-existing path-handling issues carried over by the tar.go rename. Test-coverage gaps from the review were noted but not commented inline.

artifact "github.com/stacklok/toolhive-core/oci/artifact"
)

// This file re-exports the artifact-agnostic OCI primitives that were extracted

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Is the back-compat alias layer worth it for an Alpha package?

oci/skills is marked Alpha in CLAUDE.md ("breaking changes possible"), so source compatibility isn't owed to callers yet. This layer (13 func forwarders + 3 type aliases + 7 const re-exports, plus the test-only mirror file) exists purely to avoid updating a handful of first-party importers (toolhive, dockyard, registries) — and it is the root cause of the duplicated-test/mirror-constant issue flagged in artifact_aliases_test.go.

Consider dropping the layer and migrating importers to oci/artifact directly. If a transition window is genuinely needed, keep it but mark the forwarders // Deprecated: so go doc/linters nudge migration — and still delete the duplicated skills-side tests.

const gzipOSUnknown = 255

// maxIndexManifests mirrors the artifact cap on manifests in an image index.
const maxIndexManifests = 32

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Duplicated test suites + drifting hardcoded mirror constants

These constants (maxIndexManifests = 32, maxManifestLayers = 64, gzipOSUnknown = 255) are hardcoded copies of values that are unexported in oci/artifact. There is no compiler link between them. If the real cap in oci/artifact changes (e.g. 32 -> 16), the skills-side boundary tests at registry_test.go:176,191 (maxIndexManifests+1) keep testing the stale value and can silently false-pass.

Root cause: the tar/gzip/validation tests were copied verbatim into both oci/artifact and oci/skills; the skills copies only assert "the forwarder forwards." Recommend deleting the moved-logic tests from oci/skills (gzip_test.go, tar_test.go, and the manifest-count/validating-target cases in registry_test.go) — oci/artifact now owns them — which removes the need for these mirror constants entirely.

Comment thread oci/artifact/validate.go
// MaxBlobSize is the maximum size of a blob (100MB).
const MaxBlobSize int64 = 100 * 1024 * 1024

// maxIndexManifests is the maximum number of manifests in an image index.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW] Half-open export surface

ValidatingTarget, ValidateManifestCounts, IsManifestMediaType, MaxManifestSize, and MaxBlobSize are exported, but the caps actually enforced here (maxIndexManifests, maxManifestLayers) stay unexported — so a caller of the exported ValidateManifestCounts can't introspect or configure the limits it applies. The only entry point registry.go actually uses is NewValidatingTarget.

Suggest deciding intentionally: either (a) unexport ValidateManifestCounts/IsManifestMediaType (shrinking the surface to NewValidatingTarget), or (b) commit to a public validation API and also export MaxIndexManifests/MaxManifestLayers. (a) is more honest to the single real use site — the two functions appear exported mainly to satisfy the moved skills tests.

Comment thread oci/artifact/tar.go
// SPDX-License-Identifier: Apache-2.0

package skills
package artifact

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two pre-existing path-handling issues in this renamed file (carried over verbatim from oci/skills, not introduced here — flagging since the move is a reasonable moment to fix). The relevant code isn't in the rename's diff hunk, hence anchored here.

[MEDIUM] validateTarPath (line ~161) uses path, not path/filepath — Windows \ bypass. path.Clean only treats / as a separator, so path.Clean("foo\\..\\..\\etc\\passwd") is returned unchanged and slips past the HasPrefix(cleaned, "..") check on a Windows host. The same prefix check also false-rejects a legit file named ..hidden. Fix: reject any \ (OCI/POSIX mandate /) and tighten to cleaned == ".." || strings.HasPrefix(cleaned, "../").

[LOW] ExtractTar (line ~151) stores raw hdr.Name, not the cleaned path. Validation runs first, but the uncleaned name is stored in FileEntry.Path, so a caller doing string concatenation (rather than filepath.Join) could re-introduce surprising paths from e.g. foo/./bar. Fix: store path.Clean(hdr.Name).

Severity is MEDIUM/LOW because this library targets Linux/k8s and ExtractTar returns in-memory entries (no FS write).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[plugins] Phase 0: extract artifact-agnostic OCI primitives (THV-0077)

2 participants